Data Structures

Data structures are a concrete implementation of the specification provided by one or more particular abstract data types (ADT), which specify the operations that can be performed on a data structure and the computational complexity of those operations.

Different kinds of data structures are suited for different kinds of applications, and some are highly specialized to specific tasks. For example, relational databases commonly use B-tree indexes for data retrieval, while compiler implementations usually use hash tables to look up identifiers.

Usually, efficient data structures are key to designing efficient algorithms.

Standard import statement


In [1]:
from openanalysis.data_structures import DataStructureBase, DataStructureVisualization
import gi.repository.Gtk as gtk   # for displaying GUI dialogs

DataStructureBase is the base class for implementing data structures
DataStructureVisualization is the class that visualizes data structures in GUI

DataStructureBase class

Any data structure, which is to be implemented, has to be derived from this class. Now we shall see data members and member functions of this class:

Data Members

  • name - Name of the DS
  • file_path - Path to store output of DS operations

Member Functions

  • __init__(self, name, file_path) - Initializes DS with a name and a file_path to store the output
  • insert(self, item) - Inserts item into the DS
  • delete(Self, item) - Deletes item from the DS,
                if item is not present in the DS, throws a ValueError
  • find(self, item) - Finds the item in the DS
              returns True if found, else returns False
              similar to __contains__(self, item)
  • get_root(self) - Returns the root (for graph and tree DS)
  • get_graph(self, rt) - Gets the dict representation between the parent and children (for graph and tree DS)
  • draw(self, nth=None) - Draws the output to visualize the operations performed on the DS
                 nth is used to pass an item to visualize a find operation

DataStructureVisualization class

This class is used for visualizing data structures in a GUI (using GTK+ 3). Now we shall see data members and member functions of this class:

Data Members

  • ds - Any DS, which is an instance of DataStructureBase

Member Functions

  • __init__(self, ds) - Initializes ds with an instance of DS that is to be visualized
  • run(self) - Opens a GUI window to visualize the DS operations

An example ..... Binary Search Tree

Now we shall implement the class BinarySearchTree


In [2]:
class BinarySearchTree(DataStructureBase):                        # Derived from DataStructureBase
    
    class Node:                                                   # Class for creating a node
        def __init__(self, data):
            self.left = None
            self.right = None
            self.data = data

        def __str__(self):
            return str(self.data)

    def __init__(self):
        DataStructureBase.__init__(self, "Binary Search Tree", "t.png")       # Initializing with name and path
        self.root = None
        self.count = 0

    def get_root(self):                                            # Returns root node of the tree
        return self.root

    def insert(self, item):                                        # Inserts item into the tree
        newNode = BinarySearchTree.Node(item)
        insNode = self.root
        parent = None
        while insNode is not None:
            parent = insNode
            if insNode.data > newNode.data:
                insNode = insNode.left
            else:
                insNode = insNode.right
        if parent is None:
            self.root = newNode
        else:
            if parent.data > newNode.data:
                parent.left = newNode
            else:
                parent.right = newNode
        self.count += 1

    def find(self, item):                                           # Finds if item is present in tree or not
        node = self.root
        while node is not None:
            if item < node.data:
                node = node.left
            elif item > node.data:
                node = node.right
            else:
                return True
        return False
    
    def min_value_node(self):                                       # Returns the minimum value node
        current = self.root
        while current.left is not None:
            current = current.left
        return current

    def delete(self, item):                                         # Deletes item from tree if present
                                                                    # else shows Value Error
        if item not in self:
            dialog = gtk.MessageDialog(None, 0, gtk.MessageType.ERROR,
                                       gtk.ButtonsType.CANCEL, "Value not found ERROR")
            dialog.format_secondary_text(
                "Element not found in the %s" % self.name)
            dialog.run()
            dialog.destroy()
        else:
            self.count -= 1
            if self.root.data == item and (self.root.left is None or self.root.right is None):
                if self.root.left is None and self.root.right is None:
                    self.root = None
                elif self.root.data == item and self.root.left is None:
                    self.root = self.root.right
                elif self.root.data == item and self.root.right is None:
                    self.root = self.root.left
                return self.root
            if item < self.root.data:
                temp = self.root
                self.root = self.root.left
                temp.left = self.delete(item)
                self.root = temp
            elif item > self.root.data:
                temp = self.root
                self.root = self.root.right
                temp.right = self.delete(item)
                self.root = temp
            else:
                if self.root.left is None:
                    return self.root.right
                elif self.root.right is None:
                    return self.root.left
                temp = self.root
                self.root = self.root.right
                min_node = self.min_value_node()
                temp.data = min_node.data
                temp.right = self.delete(min_node.data)
                self.root = temp
            return self.root

    def get_graph(self, rt):                                            # Populates self.graph with elements depending
                                                                        # upon the parent-children relation
        if rt is None:
            return
        self.graph[rt.data] = {}
        if rt.left is not None:
            self.graph[rt.data][rt.left.data] = {'child_status': 'left'}
            self.get_graph(rt.left)
        if rt.right is not None:
            self.graph[rt.data][rt.right.data] = {'child_status': 'right'}
            self.get_graph(rt.right)

Now, this program can be executed as follows:


In [3]:
DataStructureVisualization(BinarySearchTree).run()




In [4]:
import io
import base64
from IPython.display import HTML

video = io.open('../res/bst.mp4', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" width="500" height="350" controls>
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded.decode('ascii')))


Out[4]:

Example File

You can see more examples at Github